May 3, 2016
d =
(Martin 2011, Cukier (2012))
A complete description of a network and its topology is infeasible \(\Rightarrow\) we can study the statistics of a complex network (i.e., clustering coeffients, numbers of triangles, average degree, etc.)
Let
Goal: Estimate some statistic \(T_n\) of \(G\) using a block bootstrap.
snowboot package on CRANData: graph \(G_n\); number of seeds \(m\), \(m < n\); number of waves \(d\)
Result: sample of \(m\) seeds with \(d\) waves around each seed.
seed \(=\) randomly sample without replacement \(m\) vertices from \(G_n\)
for \(q = 1, \dots, m\) do
start with original \(G_n\) and \(seed_q\)
for \(w = 1, \dots, d\) do
\(wave_w =\) select all immediate neighbors using the existing edges
remove the used edges
end
\(patch_q =\) join the current \(seed_q\) and all \(wave_w\) keeping the repeated elements
end
join all \(m\) patches, keeping all of the repeated elements
Additionally, sample non-seeds according to non-weighted or weighted selection.
Repeat \(B\) times and calculate the estimate, \(T_n^*\) for each replicate.
Note: The authors recommend running this procedure \(T\) times to get \(T\) bootstrap distributions because high variability among LSMIs
\[T_n^* = \frac{\dfrac{1}{m}\sum_{q=1}^{m}d_q + D(1-\hat{p}_0)\dfrac{1}{m}\sum_{q=1}^m \tilde{A}_{NSq}}{\dfrac{1}{m}\sum_{q=1}^m 1 + D\dfrac{1}{m}\sum_{q=1}^m \tilde{B}_{NSq}}\]
Where
Conditions:
Theorem:
Let's try to estimate \(T_n =\) the # of edges in \(G\) \(= e\) to start.
Data: graph \(G_n\); average block size, \(m = \frac{1}{n}\sum\limits_{i =1}^n \{d_i + 1\}\); number of blocks, \(k = \lfloor\frac{n}{m}\rfloor\); intra-block connection probability, \(\hat{p}\).
Result: \(B\) estimates of \(T_n\)
for \(i = 1, \dots, B\) do
start with original \(G_n\), select \(k\) indices \(j \in 1, \dots, n\) with replacement.
for \(j = 1, \dots, k\) do
\(B_j =\) select all vertices \(k \in N_1(j)\) and vertex \(j\), keeping all the existing edges
end
join the disconnected subgraphs (\(B^*_1, \dots, B^*_k\)) via iid \(Bern(\hat{p})\) draws. The result is \(G^*_i\).
Calculate \(T_n^*\) for \(G^*_i\).
end
\[T_n^* = |E(G^*_i)| = e^*_i\]
\(\hat{p} = \frac{2(m - 2)e}{nm(n-m)}\)
See Appendix for details.
Cukier, Jerome. 2012. “Events in the Game of Thrones.” December. http://www.jeromecukier.net/projects/agot/events.html.
Martin, George RR. 2011. A Dance with Dragons. New York, NY, USA: Bantam Books.
Thompson, Mary E., Lilia L. Ramirez Ramirez, Vyacheslav Lyubchich, and Yulia R. Gel. 2016. “Using the Bootstrap for Statistical Inference on Random Graphs.” Canadian Journal of Statistics 44 (1): 3–24.